113 research outputs found

    Pancancer analysis of DNA methylation-driven genes using MethylMix.

    Get PDF
    Aberrant DNA methylation is an important mechanism that contributes to oncogenesis. Yet, few algorithms exist that exploit this vast dataset to identify hypo- and hypermethylated genes in cancer. We developed a novel computational algorithm called MethylMix to identify differentially methylated genes that are also predictive of transcription. We apply MethylMix to 12 individual cancer sites, and additionally combine all cancer sites in a pancancer analysis. We discover pancancer hypo- and hypermethylated genes and identify novel methylation-driven subgroups with clinical implications. MethylMix analysis on combined cancer sites reveals 10 pancancer clusters reflecting new similarities across malignantly transformed tissues

    Bayesian Gene Set Analysis

    Full text link
    Gene expression microarray technologies provide the simultaneous measurements of a large number of genes. Typical analyses of such data focus on the individual genes, but recent work has demonstrated that evaluating changes in expression across predefined sets of genes often increases statistical power and produces more robust results. We introduce a new methodology for identifying gene sets that are differentially expressed under varying experimental conditions. Our approach uses a hierarchical Bayesian framework where a hyperparameter measures the significance of each gene set. Using simulated data, we compare our proposed method to alternative approaches, such as Gene Set Enrichment Analysis (GSEA) and Gene Set Analysis (GSA). Our approach provides the best overall performance. We also discuss the application of our method to experimental data based on p53 mutation status

    Discovering Biological Progression Underlying Microarray Samples

    Get PDF
    In biological systems that undergo processes such as differentiation, a clear concept of progression exists. We present a novel computational approach, called Sample Progression Discovery (SPD), to discover patterns of biological progression underlying microarray gene expression data. SPD assumes that individual samples of a microarray dataset are related by an unknown biological process (i.e., differentiation, development, cell cycle, disease progression), and that each sample represents one unknown point along the progression of that process. SPD aims to organize the samples in a manner that reveals the underlying progression and to simultaneously identify subsets of genes that are responsible for that progression. We demonstrate the performance of SPD on a variety of microarray datasets that were generated by sampling a biological process at different points along its progression, without providing SPD any information of the underlying process. When applied to a cell cycle time series microarray dataset, SPD was not provided any prior knowledge of samples' time order or of which genes are cell-cycle regulated, yet SPD recovered the correct time order and identified many genes that have been associated with the cell cycle. When applied to B-cell differentiation data, SPD recovered the correct order of stages of normal B-cell differentiation and the linkage between preB-ALL tumor cells with their cell origin preB. When applied to mouse embryonic stem cell differentiation data, SPD uncovered a landscape of ESC differentiation into various lineages and genes that represent both generic and lineage specific processes. When applied to a prostate cancer microarray dataset, SPD identified gene modules that reflect a progression consistent with disease stages. SPD may be best viewed as a novel tool for synthesizing biological hypotheses because it provides a likely biological progression underlying a microarray dataset and, perhaps more importantly, the candidate genes that regulate that progression

    Extracting binary signals from microarray time-course data

    Get PDF
    This article presents a new method for analyzing microarray time courses by identifying genes that undergo abrupt transitions in expression level, and the time at which the transitions occur. The algorithm matches the sequence of expression levels for each gene against temporal patterns having one or two transitions between two expression levels. The algorithm reports a P-value for the matching pattern of each gene, and a global false discovery rate can also be computed. After matching, genes can be sorted by the direction and time of transitions. Genes can be partitioned into sets based on the direction and time of change for further analysis, such as comparison with Gene Ontology annotations or binding site motifs. The method is evaluated on simulated and actual time-course data. On microarray data for budding yeast, it is shown that the groups of genes that change in similar ways and at similar times have significant and relevant Gene Ontology annotations

    Extracting binary signals from microarray time-course data

    Get PDF
    This article presents a new method for analyzing microarray time courses by identifying genes that undergo abrupt transitions in expression level, and the time at which the transitions occur. The algorithm matches the sequence of expression levels for each gene against temporal patterns having one or two transitions between two expression levels. The algorithm reports a P-value for the matching pattern of each gene, and a global false discovery rate can also be computed. After matching, genes can be sorted by the direction and time of transitions. Genes can be partitioned into sets based on the direction and time of change for further analysis, such as comparison with Gene Ontology annotations or binding site motifs. The method is evaluated on simulated and actual time-course data. On microarray data for budding yeast, it is shown that the groups of genes that change in similar ways and at similar times have significant and relevant Gene Ontology annotations

    Boolean implication networks derived from large scale, whole genome microarray datasets

    Get PDF
    A method for analysis of microarray data is presented that extracts statistically significant Boolean implication relationships between pairs of genes

    The impact of overdiagnosis on the selection of efficient lung cancer screening strategies

    Full text link
    Peer Reviewedhttps://deepblue.lib.umich.edu/bitstream/2027.42/136362/1/ijc30602_am.pdfhttps://deepblue.lib.umich.edu/bitstream/2027.42/136362/2/ijc30602.pd
    corecore